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1.  The  Recognit  ion  and  Localization  Problem 


A  central  characteristic  of  advanced  applications  in  robotics  is  the  presence 
of  significant  uncertainty  about  the  identities  and  positions  of  objects  in  the 
workspace  of  the  robot.  In  simplest  terms,  if  a  robot  is  to  interact  intelligently  with 
its  environment,  it  must  know  what  objects  are  where.  This  normally  necessitates 
sensing  of  the  external  environment  as  a  means  of  obtaining  the  information  needed 
to  solve  the  recognition  and  localization  problem.  The  process  of  sensing  can  be 
loosely  divided  into  two  stages:  first,  the  measurements  of  properties  of  the  objects 
in  the  environment,  and  second,  the  interpretation  of  those  measurements.  Since 
the  sensory  information  could  come  from  a  variety  of  very  different  sources,  for 
example,  tactile,  ranging,  sonar  or  vision,  both  binary  and  grey- level,  it  is  important 
to  derive  recognition  and  localization  techniques  that  solve  the  interpretation  stage 
of  the  sensing  process  with  very  few  assumptions  on  the  sensory  measurements 
themselves.  In  this  article,  we  assume  only  that  the  sensory  data  is  characterized  as 
sparse,  noisy  measurements  of  the  local  geometry  of  a  small  patch  of  the  object’s 
surface,  for  example,  the  position  and  orientation  of  a  small  planar  patch  of  the 
surface  in  some  coordinate  frame  defined  relative  to  the  sensor. 

Given  these  simple  data  elements  derived  from  the  sensory  data,  the  problem 
of  model-based  recognition  and  localization  essentially  can  be  considered  as  one 
of  searching  a  space  for  a  consistent  matching  between  these  data  elements  on 
the  one  hand,  and  model  elements  representing  known  objects  on  the  other  hand. 
Since  the  data  elements  are  assumed  to  approximate  the  local  geometry  of  a  small 
planar  patch  of  the  surface,  initially  we  assume  that  the  objects  can  be  modeled 
as  polyhedra  having  up  to  six  degrees  of  freedom  relative  to  the  sensor.  The  goal 
is  then  to  define  a  matching  process  whereby  the  space  of  possible  interpretations 
of  the  sensory  data  can  be  searched  for  a  consistent  matching  of  sensory  data  to 
model  elements. 

Since,  in  general,  the  search  space  is  far  too  large  to  explicitly  explore,  the  key 
to  the  problem  is  to  derive  constraints  on  the  data  that  will  efficiently  restrict  the 
portions  of  the  search  space  that  must  be  explored.  In  this  paper,  we  present  a  set  of 
general  criteria  on  these  constraints  and  derive  a  specific  set  of  such  constraints  for 
the  case  of  geometric  sensor  measurements.  Using  this  particular  example,  we  show 
that  these  constraints  are  complete  both  for  the  simpler  case  or  three  degrees  of 
freedom  (isolated  objects  in  stable  positions)  and  for  the  general  case  of  six  degrees 
of  freedom.  We  also  show  that  the  constraints  are  exhaustive  for  the  three  degree 
of  freedom  case,  but  not  for  six  degrees  of  freedom.  Hy  exhaustive,  we  mean  that 
modulo  errors  in  the  measurement  process,  the  measured  constraints  arc  sufficient 
to  completely'  determine  the  relative  configuration  of  the  sensory  data. 

The  main  result  is  establishing  theoretical  bounds  on  the  effectiveness  of  local 
constraints  in  controlling  the  combinatorics  of  the  search  process.  Wc  stress  that 
the  results  obtained  hold  in  large  part  independent  of  the  specific  constraints  used. 
For  the  particular  constraints  derived  here,  we  also  compute  explicit  values  for 
the  predicted  bounds  and  compare  them  with  empirical  evidence  derived  earlier 
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in  Crimson  and  Lozano-Perez  l9S  ti.  Finally,  several  predictions  of  the  theory  are 
discussed,  including  the  degradation  of  the  technique  with  increased  error,  and  the 
number  of  sensory  points  generally  needed  to  guarantee  a  unique  interpretation  of 
the  data. 

1.1.  The  Basic  Problem 

The  recognition  and  localization  of  objects  from  sensory  data  is  a  centra, 
problem  of  most  advanced  robotics  situations,  it  is  usually  convenient  to  pose 
the  problem  as  one  of  search,  that  is,  given  a  set  of  known  modcis.  \se  identify 
and  locate  the  particular  object  that  wo  are  sensing  by  searching,  a  iarge  space  of 
possible  solutions  until  we  find  one  (or  all  solutions)  that  matches  the  information 
available  to  us  from  tin;  sensors.  One  of  the  main  diilicuiUes  with  the  p.-oohmi,  true 
of  most  search  problems,  is  that  the  space  of  possible  solutions  is  usually  extremely 
large,  and  one  seeks  methods  that  will  effectively  reduce  the  portions  of  the  search 
space  that  must  be  explicitly  explored.  The  problem  is  further  compounded  in  the 
case  considered  here  by  the  fact  that  the  sensory  data  against  which  a  match  is 
sought  are  typically  inaccurate,  so  that  the  matching  process  must  be  tolerant  to 
errors  in  the  data. 

The  critical  issue  in  searching  for  a  match  between  sensory  data  and  ooject 
models  is  controlling  the  potential  combinatorial  explosion  of  the  search.  There 
have  been  a  wide  variety  of  techniques  applied  to  the  recognition  problem,  all 
attempting  in  some  manner  to  control  this  explosion.  We  can  distinguish  three 
general  classes  of  approaches,  although  these  distinctions  are  i.ot  hard  and  fast, 
of  course.  The  three  general  classes  are  (i)  matching  complete  descriptions  of  the 
object  obtained  from  the  sensory  data  to  complete  model  descriptions.  vii)  mateniag 
partial  descriptions  of  the  data  to  partial  descriptions  of  the  model,  and  (id) 
matching  partial  descriptions  of  the  data  to  complete  descriptions  of  the  model. 

The  basic,  idea  behind  matching  complete  descriptions  to  complete  descriptions 
is  to  reduce  the  combinatorics  by  computing  compact  represe  iilTltlOIiS  o!  il  Mot 

of  sensory  data  and  comparing  this  representation  to  a  similar  object  mode!.  In  this 
manner,  the  matching  process  is  constrained  to  a  small  number  of  components  and 
the  combinatorics  is  significantly  reduced.  Kx. a  tuples  include  generalized  cylinder 
representations  (see  for  example.  No  vat  i  a  197-!;  Xevatia  and  Binj'ord  1977:  Marr  and 
.Whihar.i  1978;  Brooks  BiM),  and  extended  Gaussian  images  (see  for  example,  Brou 
19.83;  Horn  1983;  Horn  and  Ikeuchi  1983;  Ikcuchi  1983).  Besides  the  combinatorial 
advantage  of  comparing  them,  complete  descriptions  have  several  other  potential 
advantages.  First,  by  building  the  representation  from  a  large,  dense  set  of  sensory 
data,  it  is  likely  to  lie  insensitive  to  errors  in  the  individual  data  elements.  Second, 
even  if  only  parts  of  the  object  are  available  to  the  sensor,  the  use  of  complete 
representations  makes  ii  likely  that,  recognition  can  stiil  proceed  on  the  partial 
data.  Ot  course,  to  some  extent,  tiie  reduction  in  computational  cost  achieved  by 
matching  compact  descriptions  is  offset  by  the  additional  cost  of  processing  the 
raw  sensory  data  to  obtain  those  representations.  We  also  note  that  nut  all  global 
representations  share  the  advantages  cited  above.  For  example,  one  of  the  earliest 
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techniques  for  recognition  uses  global  properties  of  binary  images,  such  as  area, 
perimeter,  elongation  and  Euler  number,  an  approach  common  to  many  commercial 
systems  (see  for  example,  Rausch  and  bomb  197b;  Gleason  and  Agin  1979;  Machine 
Intelligence  Corporation  1980;  Reinhold  and  Vanderbrug  1980).  These  particular 
parameters  do  not  extend  well  to  overlapping  or  occluded  parts,  and  thus  such 
techniques  do  not  demonstrate  the  same  versatility  as  the  previous  ones. 

A  second  approach  to  the  problem  is  to  use  input  tokens  (frequently  called 
features)  to  the  matching  process  that  are  very  distinctive.  Typically,  if  one  can 
obtain  distinctive  features,  then  the  search  reduces  to  a  straightforward  depth  first 
exploration  of  an  interpretation  tree,  with  very  little  backtracking  involved.  For 
example,  if  I  wanted  to  recognize  a  soft  drink  can  from  visual  data,  I  could  process 
the  image  to  obtain  the  UPC  bar  code,  which  would  uniquely  identify  the  type  of 
can.  Moreover,  knowing  the  position  of  the  UPC  code  on  the  can  and  in  the  image 
would  allow  me  to  determine  the  position  and  attitude  of  the  can  in  the  scene. 
Of  course,  not  all  features  will  be  as  distinctive  as  a  UPC  code.  Simpler  examples 
might  include  corners,  holes,  notches  and  other  local  features.  The  idea,  however, 
is  that  very  few  such  distinctive  features  should  be  needed  to  identify  the  object, 
and  the  search  space  can  be  effectively  collapsed.  Examples  of  techniques  in  this 
vein  include  the  use  of  a  few  extended  features  [Perkins  1978;  Ballard  1981],  or  the 
use  of  one  feature  as  a  focus,  with  the  search  restricted  to  a  few  nearby  features 
[Tsuji  and  Nakamura  1975;  Holland  197G;  Sugihara  1979;  Bolles  and  Cain  1982; 
Holies,  Horaud  and  Hannah  1983] 

There  are  several  drawbacks  to  such  distinctive  feature  approaches,  although 
they  also  have  many  strong  points.  First,  while  the  cost  of  the  search  process  has 
been  greatly  reduced  in  this  case,  it  is  usually  at  the  expense  of  the  processing  of 
the  sensory  data.  In  other  words,  the  unique  features  required  to  straightforwardly 
identify  the  object  are  usually  not  directly  provided  by  the  sensors.  Rather,  the  data 
provided  by  the  sensors  is  typically  point  data,  for  example,  local  measurements 
of  the  position  and/or  orientation  of  points  on  a  surface.  Each  such  local  point 
measurement  is  clearly  not  very  distinctive,  and  to  create  distinct  collections  or 
features  requires  additional  preprocessing  of  the  data.  In  the  example  of  the  UPC 
code,  the  visual  input  must  be  processed  to  extract  the  code  from  the  rest  of  the 
data.  Thus,  feature  driven  recognition  techniques  usually  reduce  the  computational 
expense  of  searching  a  space  of  solutions  at  the  increased  cost  of  computing  the 
tokens  to  be  matched  between  sensory  data  and  model  elements. 

Not  only  is  there  additional  computational  cost  for  computing  appropriate 
features,  but  the  computation  may  also  be  sensitive  to  sensor  errors.  In  the  example 
of  the  UPC  code,  if  the  imaging  device  is  out  of  focus,  causing  the  image  of  the 
bar  code  to  blur  significantly,  the  recognition  process  may  no  longer  succeed.  Our 
preference  is  for  recognition  techniques  that  degrade  gracefully  with  noise,  rather 
than  suddenly  collapsing  under  the  influence  of  sensor  error. 

Clearly  these  problems  do  not  rule  out  feature  driven  recognition  schemes,  as 
is  evidenced  in  our  example  of  the  soft  drink  can,  by  the  proliferation  of  automatic 
check-out  counters  in  supermarkets.  A  more  critical  drawback  concerns  the  density, 
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or  rather,  sparsity  of  such  features  in  the  sensory  data.  By  definition,  features  well 
suited  for  recognition  and  localization  must  be  sparse  on  the  object,  since  otherwise 
there  would  be  multiple  interpretations  of  the  attitude  of  the  object  relative  to 
the  sensor.  While  this  eases  the  recognition  task,  it  also  reduces  the  situations  in 
which  the  process  can  be  applied.  In  particular,  it  requires  that  the  initial  sensory 
data  be  dense,  in  order  to  have  a  reasonable  expectation  that  the  feature  extraction 
stage  will  actually  find  a  set  of  useful  features  from  the  data.  This  requirement  on 
dense  sensing  modalities  rules  out  some  types  of  sensors,  such  as  tactile,  that  are 
inherently  sparse  in  nature.  If  possible,  we  would  prefer  a  recognition  technique 
that  is  sensor- independent. 

The  sparsity  of  features  is  also  a  problem  when  dealing  with  occlusion.  If  an 
object  is  presented  to  the  sensor  in  isolation,  then  large  portions  of  its  surface 
will  be  visible  and  the  probability  of  detecting  appropriate  features  is  high,  if  the 
object  is  partially  occluded  by  other  objects,  however,  this  may  not  be  true,  in 
our  example  of  a  soft  drink  can,  if  some  other  object  occludes  the  UPC  bar  code 
from  the  sensor,  we  will  not  be  able  to  recognize  the  can.  Note  that  this  may  occur 
even  though  virtually  all  the  rest  of  the  can  is  visible  to  the  sensor.  This  follows 
in  general  from  the  sparsity  of  useful  features.  If  possible,  w'c  would  like  to  use 
recognition  techniques  that  are  not  dependent  on  particular  localized  features  of 
the  objects,  and  are  thus,  still  capable  to  performing  recognition  and  localization 
on  partially  occluded  objects. 

In  summary,  our  main  concern  with  distinctive  feature  techniques  is  that 
because  they  are  matching  partial  descriptions  to  partial  descriptions,  it  may 
occasionally  be  difficult  to  guarantee  the  computability  of  such  descriptions  from 
the  raw  data.  This  implies  that  the  inherent  sparseness  of  such  features  on  an 
object  may  cause  problems  for  sparse  sensors,  such  as  tactile  sensors,  and  may 
cause  problems  in  situations  involving  occlusion. 

A  third  approach  to  the  recognition  and  localization  problem  is  to  use  the 
local  point  measurements  available  from  the  sensors  as  the  basic  matching  tokens. 
Of  course,  in  some  sense  these  are  also  features,  but  they  do  not  sufior  the  same 
problems  due  to  sparseness,  since  they  are  dense  on  the  object.  Since  these  data 
elements  are  very  simple,  taken  individually  they  are  not  likely  to  uniquely  identify 
the  object  being  sensed.  Thus  the  search  part  of  the  process  becomes  much  more 
critical,  and  we  will  need  to  use  local  constraints  between  such  point  measurements 
to  restrict  tire  search  space.  While  the  size  of  the  search  space  explored  in  (his 
ease  will  be  larger  than  in  the  feature  based  methods,  the  expectation  is  that  the 
unit  cost  of  searching  the  space  can  be  reduced  significantly  enough  to  overcome 
the  number  of  additional  elements  tested  in  the  search.  Representative  examples  of 
such  schemes  include  Faugeras  and  Hebert  [1983],  Gaston  and  Lozano  Perez  198'lj, 
Grimson  and  Lozano-Perez  [1984]  and  Stockman  and  lOsteva  [I9SI]. 

The  key  distinction  between  schemes  that  match  partial  descriptions  using 
low  level  sensory  measurements  and  schemes  that  match  partial  descriptions  based 
on  distinctive  features  lies  in  the  computability  of  the  sensory  data  descriptions. 
Simple,  low-level  sensor  measurements  are  likely  to  be  dense  over  the  object.  As 
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a  consequence,  recognition  schemes  based  on  such  simple  measurements  should  be 
applicable  to  sparse  sensors,  and  should  be  insensitive  to  problems  of  occlusion  and 
sensor  error,  since  an  input  description  can  always  be  obtained  and  matched  to  the 
model.  As  the  sensor  features  to  be  matched  become  more  distinctive,  and  lienee 
sparser  on  the  object,  the  probability  that  they  may  not  be  detected  by  the  sensor, 
cither  due  to  the  characteristics  of  the  sensor  or  to  problems  of  occlusion  rises.  In 
this  paper,  we  will  explore  the  use  of  recognition  schemes  that  use  low  level  sensor 
primitives  that  can  be  computed  over  the  entire  object.  We  will,  however,  consider 
using  only  a  sparse  set  of  such  measurements,  in  order  to  keep  the  combinatorics 
of  the  search  process  reasonably  controlled. 

1.2.  Assumptions  and  Approach 

As  a  consequence  of  this  discussion,  we  will  assume  that  the  basic,  sensory  data 
available  consists  of  local  estimates  of  three-dimensional  positions  and  orientations 
of  small  patches  of  the  object  surface.  In  this  case,  we  can  make  very  simple 
assumptions  about  the  elements  of  the  object  models  needed  for  matching.  In 
particular,  since  the  data  elements  are  measuring  the  local  geometry  of  small 
patches  of  the  object  surface,  we  assume  that  the  object  models  are  also  constructed 
of  small  local  patches.  Thus,  our  two  assumptions  about  the  elements  to  be  matched 
between  sensory  input  and  object  models  are: 

1.  The  objects  are  all  modeled  as  polyhedra  having  six  degrees  of  freedom. 

2.  The  sensory  data  available  to  the  process  include  positions  of  points  on 
the  object,  to  within  some  volume  of  error,  and  surface  orientations  at 
those  points,  to  within  some  cone  of  error. 

The  basic  approach  to  the  problem  is  to  determine  the  set  of  positions  and 
orientations  of  an  object  that  arc  consistent  with  this  sensed  data.  If  there  are  no 
consistent  positions  and  orientations,  then  the  object  can  be  excluded  from  the  set 
of  possible  objects.  The  elements  to  be  matched  are  thus  simple  local  patches  of  a 
surface. 

The  technique  proceeds  in  two  steps: 

1.  Generate  Feasible  Interpretations:  A  set  of  feasible  interpretations  of  the 
sense  data  is  constructed.  Interpretations  consist  of  pairings  of  each  sensed 
point  with  some  object  surface  on  one  of  the  known  objects.  Interpretations 
inconsistent  with  local  constraints,  derived  from  the  model,  on  the  sense 
data  are  discarded. 

2.  Model  Test:  The  feasible  interpretations  are  tested  for  consistency  with 
surface  equations  obtained  from  the  object  models.  An  interpretation  is 
legal  if  it  is  possible  to  solve  for  a  rotation  and  translation  that  would 
place  each  sense  point  on  an  object  surface.  The  sensed  point  must  lie 
inside  the  object  face,  not  just  on  the  surface  defined  by  the  equation. 

There  arc  several  possible  methods  of  actually  searching  for  consistent  matches. 
For  example,  Crimson  and  Lozano- Perez.  1 1  fJS  I J  chose  to  structure  the  search  as 
the  generation  and  exploration  of  an  interpretation  tree.  That  is,  starting  at  a  root 
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node,  we  construct  a  tree  in  a  depth  first  fashion,  assigning  tiaia  points  to  mood 
faces.  At  the  first  level  of  the  tree,  we  consider  assigning  the  hrst  data  point  to 
all  possible  faces,  at  the  next  level,  we  assign  the  second  data  point  to  ah  possible 
faces,  and  so  on. 

Clearly,  the  first  step  is  the  key  to  this  process.  The  n>.mL-r  of  possible 
interpretations  given  *•  sensed  points  and  n  surfaces  is  n'\  Tt.erefore,  it  ,s  not 
feasible  to  explore  the  entire  search  space  in  order  to  appiv  a  mode;  test  to 
atl  possible  interpretations.  Moreover,  since  tin;  computation  of  cooruin.-.a-  frame 
t ranslorinat iotis  tends  to  be  expensive,  we  want  to  apply  this  part  of  tuc  uvh..a;*»e 
only  as  needed.  Tlte  goal  of  the  recognition  algorithm  is  if, us  to  exploit,  aa.ii 
constraints  on  the  sensed  data  so  as  to  minimize  the  number  of  interpret  at  .»,«>  that 
neeu  testing,  while  keeping  the  computational  cost  of  each  constraint  snitw.  in  the 
cast1  of  the  interpretation  t ree,  we  need  constraints  between  the  data  elements  and 
the  model  elements  that  will  allow  us  to  remove  entire  subtrees  from  consideration 
Without  explicitly  hating  to  search  those  subtrees. 

In  searching  for  appropriate  constraints  to  apply  to  the  generation  stage, 
several  criteria  are  appropriate. 

1 .  The  constraints  snouid  be  coordinate  frame  independent.  Tftai  is,  we  wouiu 
like  to  derive  constraints  that  remove  large  portions  of  the  search  space, 
independent  of  the  particular  orientation  of  the  object.  This  suggests  that 
the  constraints  should  embody  restrictions  due  to  the  shape  of  the  object, 
and  not  due  to  the  specifics  of  the  sensing  geometry. 

2.  The  constraints:  shoutd  be  simple  and  have  low  computational  cost. 

3.  The  constraints  should,  at  the  same  time,  be  as  powerful  as  possible  in 
the  sense  of  removing  large  portions  of  the  overall  search  space. 

•t.  The  constraints  should  degrade  gracefully  in  the  presence  of  error  in  the 
sensory  measurements. 

n.  The  constraints  should  be  independent  of  the  specifics  of  the  sensor  from 
which  the  data  came,  so  that  they  will  apply  equally  to  different  sensing 
modalities. 

These  constraints  are  very  similar  to  those  suggested  by  Marr  and  \ishihara 
(sec  aiso  Marr  iuK2  ). 

A  Specific  Set  of  Local  Constraints 

Wi  begin  our  analysis  by  first  deriving  a  set  of  coordinate  frame  indi-pende  it 
i  oust  ratals,  that  were  first  presented  in  ^Irimson  and  l.o/ano-l  Yt>v  iS.vi  .  The 
lirat  quest  ion  posed  is  what  types  of  coordinate-frame-  independent  i  oust  r. lints 
are  possible,  given  that  the  «<  nsory  data  is  characterized  as  sparse  points,  r.u  ti 
cond-iing  of  a  position  measurement  and  a  unit  surface  normal  bee  Figure 
(  ie.irlv  a  single  dal  a  point  provides  no  constraint  on  (lie  possible  baa  -  from  she 
model  tt.nl  could  consistent  iy  be  assigned  to  it.  Thus,  we  iook  at  pairs  »,f  sen:  orv 
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I'igure  I.  The  constraints  between  pairs  of  sensory  points. 

points,  and  the  basic  information  available  from  these  points  consists  of  a  pair  of 
unit  normals,  as  wad!  as  the  vector  separating  their  bases,  as  shown  in  Figure  t. 

One  way  to  get  coordinate-frame-independcnt  constraints  is  to  construct  a  local 
coordinate  frame  relative  to  the  configuration  itse'f.  Thus,  wo  can  use  each  of  the  two 
unit  normals  as  axes  of  the  local  coordina'e  frame.  It'  two  dimensions,  these  define  a 
local  system,  except  in  the  degenerate  case  of  the  unit  normals  being  ( ant i-)para!!el. 
In  three  dimensions,  the  third  component  of  the  local  coordinate  frame  can  be 
taken  as  the  unit  vector  in  the  direction  of  the  cross  product  of  the  first  'wo  basis 
vectors.  Given  such  a  local  basis,  clearly  one  set  of  coordinate-frame-independcnt 
measurements  provided  by  the  configuration  of  Figure  1  is  the  components  of  the 
separation  vector  along  each  of  the  basis  directions.  (Mote  that  the  use  of  the 
distance  and  two  of  the  components  is  equivalent,  up  to  a  possible  sign  ambiguity, 
to  using  the  three  components  of  the  vector.)  Additionally,  the  angle  between  the 
two  basis  vectors  is  also  specific  only  to  the  local  coordinate  frame.  More  formally,  if 
the  unit  vectors  arc  denoted  bv  n,,  n  >  and  the  vector  separating  the  two  points  is  d, 
then  one  set  of  coordina'e frame-independent  measurements  of  this  configuration 
is 
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where  n  is  a  unit  vector  in  the  direction  of  n i  X  im.  (Note  that  these  measurements 
are  isomorphic  to  the  set  used  in  [Crimson  ami  Lozano- Perez  198-1..) 

'l'hese  are  coordinate  frame- independent  measurements  on  the  configuration 
defined  by  a  p:ii r  of  sensory  points.  To  turn  them  into  constraints  on  the  search 
process,  we  must  map  them  into  equivalent  measurements  on  the  model  elements. 
Since  « ach  object  is  modeled  as  a  complex  poiyhedra,  the  mapping  is  i airly 
straightforward,  l  or  example,  consider  the  first  measurement,  n,  ■  n -.  In  orr.er  for 
these  two  sensory  ;>.... its  :o  be  consistent  with  a  pair  of  faces  on  an  object,  me  dot 
product  of  ttie  i  or;.  spanning  face  normrus  must  agree  with  tins  measurement.  Vi.  .s, 
hv  precomput  ;ng  tne  atip.es  between  ail  pairs  ol  laces  on  an  object,  ,nis  sen-ory 
a  .1  ■- .» r  iiicit v  c.ti  :u-  i.st  '■  v-‘  v.u-t  ram  l tn*  >v n r c h  lor  it  \ » 1  * ■  r x j r *  lulion .  1 1\ 

particular,  if  a  p.i.r  of  ,»•  ..  ts  is  inconsistent  with  a  particular  assignment  of  faces 
to  thos-'  points,  t  he  entire  sum  roe  of  the  interpretation  tree  lying  boiov  ’he  node 


corresponding  to  that  as- .ptum  at  can  be  ignored,  thereby  reducing  ’  ...  mat  of 

searching  requir'd.  bandar  constraints  can  be  derived  for  the  con  ms  of  the 
separation  vector  m  i ;.*•  t.ri  c'.ons  of  the  unit  normals.  That  is.  <  ach  pair  of 
faces  the  model,  oi.e  c.i  .  pr  -compute  tiie  range  of  values  of  trie  c;  out  of  a 
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pos.'  .oris  having  ■  . . : . :  on  the  first  face  and  the  other  oadp  mi:  .m  t  ,.e  second. 


A  :.  :,i.  for  a.  assignment  of  r.sory  points  to  faces  to  be  cons. -".ora.  it  must  he 
• ;.  ■  i  a-.-  tin  it  : ;i  ■  coo.-nm  e- frame- independent  sensory  metis. ire::, r mi  :nu-.t  agree 
w  .'  :i  pn  comp;: :  e.t  m.  ,,  ■  values.  .  ims,  we  iiave  define, i  one  pos.-ime  matching 
:>:•  -;va  i.etwo.  n  •■'iisory  areinents  and  models,  simitar  to  that  present.  o  in 

>.r  n  and  i.o/.ar.,'-iVr.'i  S-i, 


3.  Th-  (  oust  r aiuts  >ati-.fy  Our  Criteria 

We  neg.n  on r  :  i.eor.-i  .cm  investigation  oy  establishing  tii.it  the  constraints 
derived  above  1 1 ■  * •  t  the  criteria  of  simplicity,  coordinate- frame-independence, 
•.■mp.ctun  s:>  amt  <  x:ta..si.veaess  We  will  then  consider  their  combim.tori.il  power 
.mi  their  degradation  wit h  error. 

c>.  1 .  ('oordinai  e  - 1  rail.,  - S independence 

t  .,  . . r i y  hr.  *fu  ,-p .a  ;  ;.e  ocrivat .oi,,  tne  const ra .nts  arc  .  d . .... :  -fr..;:.e 

: n>.ej ••  mient.  !  c <’o’, -r.  ..Iso  satisfy  our  requirement  of  simp. a  sty  Oota.iimg 

!..e  'ii'i.ry  i . :  1 1 :  ol  ;  ne  con-!  mint  is  straigiiu  or  ward,  and  tin-  i ..... .  .  of  i  he 

c,>  era, ;ns  c.i  *.i  pr-c.-m.  ....  d  dir.et.y  Iron,  the  model.  The  m  pro.  c-s 

it  -  •  , :  ill  n.'ci  me-  a  :  i  p !  •  i  aide  -  look  up  process,  which  ai-u  sali-.n  -  I  In  notion  of 
-duplicity  ..mi  c., mpnt.it  ,.,;iai  ease. 

if. '2.  ('nrn  piet  mess  of  tiie  ('.distraint  s 
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independent  coordinate  frame  free  constraints  between  pairs  of  season  points  that 
are  not  already  incorporated  into  these  particular  constraints.  This  ran  be  easily 
established  by  t  lie  billowin''  argument.  Consider  the  configuration  'dust  rated  in 
Figure  1 .  Suppose  we  const  ruet  some  ar bit  rary  global  coordinate  frame,  having  ' ree 
rotational  degrees  of  I'roedom  0,0,'.'  and  three  translational  degrees  nr  ,rc>-dom, 
given  by  the  components  of  t he  vector  pi ,  relative  to  t he  described  conhgur.i'  nm.  To 
complete  desc  ribe  the  configuration  shown  will  require  a  tola!  of  10  equations,  three 
to  specify  the  base  position  pi,  three  more  to  specify  the  separation  vector  cl,  and 
two  each  to  specify  the  relative  attitudes  of  ri|,nj.  These  ten  equations  are  •  a: ;  licit 
functions  of  several  parameters,  including  the  three  angular  and  three  t  ran-Tt' io.ual 
degrees  of  freedom  described  above.  Thus,  to  reduce  this  set  of  ]()  constrain1'  to  a  set 
of  coordinate-frame- independent  constraints,  we  must  resolve  the  set  of  eqnnt  ions 
to  remove  the  explicit  dependence  on  the  parameters  related  to  the  speciT  choice 
of  global  coordinate  frame.  Clearly  this  will  require  at  least  six  equat'ons,  and 
thus  there  are  at  most  four  coordinate- frame- independent  const  mbits  given  by  this 
particular  pairwise  configuration  of  sensory  points. 

In  the  simpler  case  of  two  dimensions,  we  find  that  there  are  three  coordinate- 
frame-independent  constraints.  Thus,  we  see  that  in  both  cases,  the  constraints 
outlined  above  are  complete. 

3.3.  Exhaustiveness 

While  the  above  analysis  indicates  that  the  set  of  constraints  in  complete,  in 
that  there  art'  no  additional  independent  constraints  possible,  we  can  also  examine 
the  exhaustiveness  of  these  constraints.  In  particular,  we  can  ask  the  following 
question.  Given  the  configuration  of  Figure  !.  suppose  we  know  the  position  and 
orientation  of  one  of  the  endpoints  and  its  orientation  vector  in  some  coordinate 
frame.  Does  the  information  provided  by  the  •<  -a!  constraints  of  the  above  section 
uniquely  determine  the  position  and  orientation  of  the  second  point  uniquely’’  In 
other  words,  ignoring  issues  of  error,  how  we!!  do  the  local  measurements  restrict 
the  possible  interpretations  of  the  data!*  Bv  counting  degrees  of  freedom,  we  would 
expect  the  second  point  *o  be  uniquely  determined  for  the  two  dimensional  case, 
but  not  for  the  three  dimensional  one.  We  now  proceed  to  establish  this  claim. 

3.3.1.  Two  Dimensions 

We  first  consider  the  case  of  two  dimensions.  Given  one  sensory  point,  consisting 
of  a  position  vector  and  an  associated  unit  norma!  vector,  we  construct  a  relative 
coordinate  system  with  origin  at  the  end  of  the  position  vector  and  with  -  y  axis 
oriented  along  the  unit  normal.  Thus,  a  second  sensory  point,  can  be  characterized 
in  this  space  by  a  unit  vector  n  offset  by  some  other  vector  p,  which  we  represent 
by  the  ordered  pair 


To  determine  the  exhaustiveness  of  the  constraints,  we  need  to  show  that  the  pair 
fp.n)  is  uniquely  determined  by  the  following  measurable  parameters: 

1.  The  angle  0  between  the  unit  normals. 
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2.  The  components  m,  an<i  r.\ of  the  vector  p  relative  to  the  two  unit 
normals  respectively. 

By  definition  of  the  ioca!  coordinate  system,  m  -  (0,  —  i’,.  and  since  Tie  angle 
0  can  be  determined  from  the  sensory  measurements,  the  second  unit  normal  is 
also  completely  determined,  re*  (sin  0,  —  cos 0). 

To  begin,  we  ignore  the  special  degenerate  case  of 

(ni  •  n2)2  =  1. 

in  this  case  the  vector  p  can  be  represented  by 

P  =  QIl  1  T  /Jno 

where  a  and  are  parameters  to  be  determined.  By  taking  the  dot  product  of  p 
with  each  of  the  unit  normal  vectors,  we  obtain  the  following  system  of  equations: 

nr  -r  /?  cos  0  —  m\ 
a  cos  0  r  3  —  m^- 

Since  cos 0  ^  -u  1 ,  we  ran  solve  this  system  for  a  and  f?,  yielding 

in i  -  cos0rn2  rn 2  —  cos  0m\ 

P  —  -  , - IT  + - - - n2. 

sin"  0  sin“  0 

d  ims,  the  ordered  pair  (p,nj  is  completely  determined  by  ibe  measured  values. 

C’leoinetrically,  we  note  that  the  component  rri|  constrains  p  to  'lie  on  the  'line 
y  — ■  rn  <  as  shown  in  Figure  2.  Since  the  orientation  of  the  second  unit  normal  is 
known,  the  final  constraint  comes  from  the  component  of  the  vector  between  ihc 
two  sensed  point s  in  the  direction  of  this  normal,  in  particular,  that  -  p  ■  n  rn->. 
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In  Figure  2,  this  corresponds  to  finding  the  ray  perpendicular  to  n  at  a  specified 
distance  that  contains  the  origin.  Yhis  ray  completely  determines  the  geometry. 

In  the  degenerate  case  of  the  two  unit  normals  being  (ant i-)parallel,  we  can 
use  the  magnitude  of  the  distance  vector  d  =  \/P  '  P  as  an  additional  constraint 
in  place  of  the  now  redundant  constraint  p  •  no  =  m->.  In  this  case,  the  distance 
constrains  p  to  lie  on  a  circle  of  radius  d.  Combining  this  with  the  constraint  given 
by  the  component  ?ri\  restricts  p  to  one  of  two  positions,  and  there  are  two  solutions 

(p,  n)  —  -  mf,  —mi  j, 

where  |m[|  =  |m2|. 

Thus,  over  the  set  of  all  possible  relative  orientations  of  faces,  the  constraints 
for  the  two  dimensional  case  are  almost  always  complete.  This  is  illustrated  in 
Figure  2. 

3.3.2.  Three  Dimensions 

We  can  consider  a  similar  analysis  in  the  three  dimensional  case.  As  before, 
we  construct  a  relative  coordinate  system  with  origin  at  the  first  position  vector 
and  with  the  —z  axis  point  along  the  surface  normal.  The  x  and  y  axes  can  be 
oriented  arbitrarily.  Here,  we  have  a  five  degree  of  freedom  problem,  since  we  must 
determine  the  ordered  pair 

(P.n) 

where  p  has  three  degrees  of  freedom 

P  = (*,y,z) 

and  where  n  has  two  degrees  of  freedom 

n  =  (sin  0  cos  0,  sin  0  sin  0,  cos  <f> )  —  7r  <  0  <  tt,  0  <  <£  <  7r. 

The  constraints  in  this  case  are: 

1.  The  dot  product  between  the  two  surface  normals,  e. 

2.  The  components  mi  and  m2  of  the  vector  p  relative  to  the  two  unit 
normals,  n  1  and  ri2,  respectively. 

3.  The  component  m;t  of  the  vector  p  relative  to  the  cross  product  of  the 
two  unit  normals,  nj  X  n2. 

As  in  the  two  dimensional  case,  we  exclude  the  degenerate  case  of 

(nt  •  n2)2  =  1. 

A  similar  analysis  holds  in  that  case,  using  the  distance  constraint  d  —  p  •  p 
instead. 

The  constraints  essentially  supply  four  equations  in  five  unknowns,  z,y,z,9,<j> 
that  wc  wish  to  resolve.  The  equations  are  given  by 
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—  Z  =  m  l 

sin  <f>(x  cos  0  +  y  sin  0)  +  z  cos  <j>  —  mj 
sin  4>(x  sin  0  —  y  cos  0)  =  m3 
—  cos  4>  —  e. 

Straightforward  algebraic  manipulation  yields  the  following  parameterized  solution: 

=  —  —  (m3  sin  0  +  [m2  —  cmi]  cos#) 
sin  <p 

1/(0)  —  .  —  (—m3  cos  0  -f  [m3  —  cmji  sin  0) 
sin  <f> 

z  --=  —mi 

n((?)  =  (sin  <t>  cos  0 ,  sin  <^sin  &,  —e) 
where  simp  can  take  on  one  of  two  values, 


In  more  geometric  terms,  this  particular  resolution  of  the  equations  restricts  p  to 
lie  on  a  circle,  and  for  each  point  on  that  circle,  there  are  two  possible  values  for  the 
orientation  of  the  surface  normal  n.  Thus,  we  see  that  these  pairwise  constraints 
alone  do  not  completely  determine  the  conhguration  of  a  pair  of  sensory  points, 
even  in  the  general  case. 

4.  A  General  Theoretical  Basis 

The  key  theoretical  issue  still  to  be  settled  is  the  combinatorial  power  of  the 
local  constraints  in  reducing  the  number  of  consistent  hypotheses.  In  this  section, 
we  develop  a  theoretical  basis  for  analyzing  the  combinatorics  of  the  recognition 
process.  We  will  see  that  while  worst  case  bounds  on  the  number  of  interpretations 
consistent  with  sparse  data  tend  to  be  very  weak,  expected  case  bounds  turn  out  to 
be  very  strong,  and  are  in  fact  supported  by  empirical  evidence.  \\c  will  begin  our 
study  by  first  considering  the  three  degree  of  freedom  case.  We  will  then  extend 
the  analysis  is  a  straightforward  manner  to  the  full  six  degree  of  freedom  case.  The 
first  question  we  consider  is  that  of  deriving  theoretical  bounds  on  the  efficiency 
of  the  constraints  in  restricting  the  search  space.  We  will  derive  bounds  that  are 
independent  of  the  specific  constraints  described  above,  and  then  show  how  well 
these  particular  constraints  perform  in  restricting  the  search  space. 

■1,1.  Relative  Configuration  Space  (RC-spar.e) 

Our  general  method  of  analysis  can  be  summarized  in  the  following  manner. 
Since  al!  of  the  constraints  available  to  the  recognition  process  are  pairwise 
constraints,  we  have  no  information  to  apply  to  the  assignment  of  a  lace  to  the  first 
sense  point,  /).  Hence,  we  will  arbitrarily  assign  a  face  /,  to  this  point,  compute  a 
bound  on  the  number  of  interpretations  consistent  with  that  choice,  and  then  sum 
over  all  such  assignments  of  To  obtain  a  bound  on  the  number  of  interpretations, 
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we  will  use  the  notion  of  a  relative  configuration  space  (RC-space).  Throughout 
the  analysis  we  will  consider  two  separate  cases.  The  first  is  one  in  which  the 
objects  are  non-overlapping  and  lie  in  stable  positions.  This  case  is  essentially  a 
two  dimensional  one,  and  the  objects  have  three  degrees  of  freedom  relative  to 
their  models,  two  translational  and  one  rotational.  The  second  case  is  one  in  which 
the  objec's  are  arbitrarily  oriented  in  space.  'Phis  is  a  three  dimensional  problem 
and  the  objects  here  have  six  degrees  of  freedom  relative  to  their  models,  three 
translational  and  three  rotational. 

In  the  two  dimensional  case,  we  define  an  RC-space  in  the  following  manner. 
Each  face  of  an  object  is  defined  by  a  two-dimensional  position  (given  by  the 
position  of  the  midpoint  of  the  face)  and  an  orientation  (given  by  the  orientation  of 
the  normal  of  the  face).  'To  ease  the  analysis,  we  will  assume  that  all  of  the  edges 
of  the  object  (or  polygon)  have  equal  length  i.  For  a  three  dimensional  object,  we 
assume  that  all  of  the  faces  are  squares  of  side  l. 

Given  a  face  /,  assigned  to  the  first  point  P\,  we  can  define  a  three  dimensional 
coordinate  system  relative  to  this  face,  consisting  of  two  coordinates  of  spatial 
extent  and  one  of  angular  extent.  The  origin  of  the  two  spatial  dimensions  is  set 
at  the  midpoint  of  the  base  face  (/,)  (with  the  face  extending  along  the  x  axis  and 
with  the  normal  of  the  face  pointing  in  the  —y  direction)  and  the  origin  of  the 
angular  dimension  is  set  to  the  orientation  of  the  norma!  of  the  base  faee.  Thus  we 
have  defined  a  configuration  space  relative  to  the  orientation  of  a  particular  face. 
The  position  of  another  face  is  completely  specified  by  the  position  and  orientation 
of  its  midpoint  in  this  RC-spacc. 

In  the  three  dimensional  case,  we  construct  a  5  dimensional  relative  configuration 
space,  based  on  the  first  face  /,.  This  RC-space  has  three  positional  dimensions 
and  two  orientation  dimensions  (since  only  two  angles  are  needed  to  specify  the 
orientation  of  a  unit  vector).  Here,  the  origin  of  the  spatial  components  is  defined 
to  be  the  midpoint  of  the  base  face,  with  the  surface  normal  pointing  in  the  —z 
direction.  The  edges  of  the  face  arc  aligned  with  the  x  and  y  axes.  Thus,  the  position 
of  any  other  faee  is  given  by  the  position  of  its  midpoint  in  this  RC-spacc.  The 
rotational  components  of  a  face  are  defined  by  two  different  angles,  — 7r  <  0  <  w 
and  0  <  <p  <  7r,  where  6  describes  (he  elevation  of  the  unit  normal  relative  to  the 
—2  axis,  and  0  describes  the  orientation  of  that  unit  normal  in  the  x  —  y  plane. 

To  obtain  bounds  on  the  effectiveness  of  local  constraints  in  pruning  the  search 
space  of  feasible  interpretations,  we  need  to  map  both  the  models  and  the  sensory 
information  into  this  RC-space.  By  enumerating  the  intersection  of  these  two 
mappings,  we  will  be  able  to  analyze  the  combinatorial  efficiency  of  the  constraints. 

Clearly,  given  a  base  face  for  the  RC-spacc,  each  additional  face  of  the  model 
is  represented  by  the  position  of  its  midpoint  in  this  space,  and  that  is  given  by 
the  spatial  offset  of  its  midpoint  relative  to  the  midpoint  of  the  base  face,  and 
the  angular  variation  of  its  normal,  relative  to  the  normal  of  the  base  face.  Thus, 
each  face  projects  to  a  point  in  this  RC-space,  and  the  entire  object  is  given  by  a 
scattered  collection  of  such  points. 
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Consider  now  what  the  sensory  constraints  tell  us  about  the  portions  of 
possible  faces  in  this  RC-space.  With  no  sensory  constraint,  each  face  of  the  object 
could  lie  anywhere  within  the  RC-space.  The  sensory  constraints  on  distance  and 
relative  angle,  however,  will  restrict  the  set  of  possible  faces  to  lie  within  a  reduced 
volume  in  RC-space.  Thus,  giv  ti  the  assignment  of  the  first  point  to  face  /,,  the 
seco/id  lace  can  be  restricted  to  lie  within  some  finite  volume  of  RC-space.  Clearly, 
this  should  generally  place  a  restriction  on  the  number  of  faces  consistent  with  the 
sensory  data,  and  our  goal  is  to  obtain  bounds  on  this  restriction,  by  considering 
the  characteristics  of  this  restricted  volume  in  RC-space. 

In  more  detail,  we  consider  what  the  effects  of  the  sensory  constraints  are  in 
tiie  three  degree  of  freedom  case.  In  particular,  given  a  second  sense  point  /b,  we 
know  the  following  facts. 

1.  Distance.  The  measured  distance  between  the  two  sense  points  clearly 
restricts  the  position  components  of  the  second  face  to  lie  within  a 
restricted  area.  This  follows  from  the  observation  that  if  the  base  of  the 
vector  connecting  the  two  sense  points  must  lie  somewhere  on  the  first 
face,  and  the  length  of  the  vector  is  given  by  a  measured  distance  d, 
which  is  accurate  to  within  a  range  c,  then  the  set  of  positions  in  which 
the  midpoint  of  the  second  face  can  lie  is  restricted  to  iie  within  an 
area  specified  by  d  and  (.  (We  will  see  later  one  method  for  analytically 
specifying  this  area.) 

2.  Angle  of  the  vector.  The  measured  components  of  the  vector  between 
sensed  points  relative  to  the  measured  surface  normals  define  the  angle 
between  those  vectors  0.  This  angle  is  subject  to  a  range  of  values,  which 
in  this  case  is  a  function  of  d,  c  and  7,  the  bound  on  errors  in  measuring 
surface  normals. 

3.  Angle  of  the  normal.  The  angle  between  the  surface  normals  at  the  first 
and  second  faces  can  be  restricted  to  some  measured  value  plus  an  error 
range  defined  by  7. 

As  a  consequence  of  these  constraints,  we  see  that  if  the  first  point  lies 
somewhere  on  the  base  face  /,,  then  the  second  point  must  iie  within  some  volume 

V(d,r,f,0,t/>,  7) 

of  Rf, ’-spare.  Note  that  three  of  the  parameters  defining  this  volume  are  measured 
values  and  three  are  global  parameters  set  by  the  object  and  the  error  sensitivities 
of  the  measuring  device.  To  reflect  this,  we  rewrite  the  volume  of  RC-space  as 

v,,/l7K  0,  tp). 

.Vote  that  this  symbol  represents  a  region  in  a  three-dimensional  RC-space.  We  will 
use 

to  denote  the  magnitude  of  this  volume. 

(liven  that,  face  /,  is  initially  assigned  to  the  first  sense  point,  we  let 


(Irimson 


( 'ombin.'itorir.H  of  Modi  J- )iii*i<l  Itifoijnilion 


Pi{x,y,<l>) 

denote  the  distribution  of  faces  relative  to  this  assignment.  Thus,  p,  is  a  sum  of  n 
delta  functions,  each  of  which  reflects  the  configuration  of  a  face  relative  to  /,,  and 


L  iy  i[h~ 0  Pl^X' V’  ^  dx  dy  dd>  ~ 


If  vve  let  denote  the  sensed  measurements  between  points  /',  and 

Pj,  then  the  number  of  interpretations  consistent  with  k  +  1  points  of  sensory  data, 
given  face  /,  assigned  to  P\  is  bounded  by 


Pi[x,  y,  4>)  dx  dy  d<f> 


and  the  total  number  of  interpretations,  over  all  possible  assignments  of  the  first 
point  is  bounded  by 


n  1 


e  n 


Pt{^,y,d>)dx  dy  d<i>. 


(0 


Clearly,  in  the  worst  case,  this  is  bounded  above  by  n  ■  nk  =  n*4’1,  as  would  be 
expected.  We  now  consider  means  for  obtaining  better  bounds. 

Note  that  the  expression  above  only  considers  the  effects  of  the  sensory 
constraints  between  the  first  and  the  point,  and  ignores  the  effects  of  intermediate 
points  in  constraining  the  possible  values  for  the  j^1  face.  Clearly,  a  better  bound 
would  be  given  by 


pi{x,y,4 i)  dxdyd<{>. 


(2) 


Even  here,  the  worst  case  behavior  is  still  n*  +  l. 

There  arc  several  ways  we  could  obtain  tighter  bounds.  One  is  to  note  that  for 
most  realistic  objects,  there  is  a  minimum  distance  between  differentfac.es,  and  thus 
the  density  of  faces  in  ItC-space  has  an  upper  limit.  'This  could  be  used  to  place 
bounds  on  the  expressions  derived  above.  The  second  method  is  to  seek  expected 
bounds,  that  is,  bounds  that  will  hold  in  general,  so  that  the  class  of  objects  for 
which  the  bound  is  exceeded  is  small,  and  hopefully,  has  measure  zero,  yielding  an 
almost  everywhere  bound.  'Phis  relies  in  some  sense  on  a  type  of  limiting  behavior 
assumption  as  follows.  Suppose  we  consider  k  different  polygons,  each  with  n  faces 
and  diameter  D.  Then  in  the  limit  as  k  tends  to  oc,  the  distribution  of  the  nk 
faces  tends  towards  a  uniform  distribution.  If  wc  then  average  over  k,  we  can  assert 
that  the  expected  distribution  of  faces  is  also  uniform,  over  an  area  defined  by  the 
diameter  of  the  object,  D. 


Thus,  we  let  vp  denote  the  magnitude  of  the  total  usable  volume  of  ItC-space, 
given  by 


vp 


.ir l)~  2ir. 
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The  key  point  of  this  assumption  of  uniform  distribution  is  that  the  integral  of  the 
distribution  func  tion  will  depend  only  on  the  magnitude  of  the  swept  volume,  and 
not  on  its  specific  position  in  RC-space, 


I  (*hij  ,  Wn;  ) 


Pi{x,  y,4>)  fix  dy  dip 


,1, "i  {fin  i  j  >  ^  mj  >  V  mj  j 

vr 


Thus,  if  we  can  bound  the  magnitude  of  the  volume 


Vi,f  ^ nij  j  Ipmj) 

independent  of  the  specific  sensory  measurements,  say, 

r  i 

v\,t,  7  = 

L  J 

then  equation  (l)  leads  to  the  expected  bound  of 


L 

This  is  still  a  relatively  weak  expected  bound,  since  it  is  basic., it;.  <  i  t  r. ,■  fmm  a  *. 
I'n less  nc  <  1  this  bound  will  increase  with  increasing  k,  even  ;i\u:  \  Tn  .> 
if  the  volume  of  RC-space  swept  out  by  a  single  set  of  pairwise  eon.-t r.m.ts  is  less 
than  times  the  total  available  volume,  the  expected  number  of  interpret  a! as 
will  decrease  with  increasing  sensory  information. 


We  should  be  able  to  obtain  a  tighter  bound  by  using  equation  V:.h 
equation  reflects  that  fact  that  while  the  constraints  between  .he  nr-t  and  ,.,ru 
sensory  points  yield  a  volume  for  possible  choices  for  the  third  f.u  <■  of  ,m 

interpretation,  the  constraints  between  the  second  and  third  sensory  point'  v. . t .  m 
general  yield  a  second  volume  of  possible  choices  for  the  third  fac.  In  g.  .u  rn,  n.e 
intersection  of  these  two  volumes  will  be  a  smaller  volume  of  o  t.siste.n!  ehoi.-es 
for  the  third  face.  This  observation,  of  course,  extends  for  nil  h.riher  face"  ,n  ,.n 
interpretation.  In  order  to  preside  bounds  on  the  magnitude  of  tin.  inter--. .  t.  u 
volume,  independent  of  the  position  and  orientation  of  the  additional -< -nsory  . 
relative  to  the  first  one,  we  consider  the  foilowdng  technique. 

First,  we  decouple  the  spatial  and  angular  components  of  the  R<' -volume, 


V 


'/n 


Vf, 


(g)V 


a 

<Ar 


We  let  m  denote  the  maximum  separation  of  points  lying  in  the  s,iai,.u  pm, am  of 
the  volume,  V*.  Then  wc  can  define  C,yi7  to  be  the  cube  (square  m  the  .  as.  . ■!'  t  ,vn 
spatial  dimensions)  of  side  m  centered  about  the  center  of  mass  of  V;',  Fmady, 
we  let 


In  intuitive  terms,  V*  represents  a  larger  volume  of  RC-space  that  ioniums  the 
volume  of  possible  configurations  for  a  sensory  point,  independent  of  the  spe<  die 
orientation  of  previous  sensory  points.  As  a  consequence, 
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j  t  i  l 

Q m  j  t  V*tw  j)  —  0  Vi,trf{dinj>  ji  V’mj  )' 

m  -- 1  tn  -  -  I 

In  order  to  bound  the  magnitude  of  this  new  volume,  we  rely  on  the  following  two 
lemmas. 


Lemma  l.  Consider  a  parameter  range  of  extent  a  and  a  serond  parameter 
range  of  extent  b.  If  the  two  ranges  must  overlap  and  the  ranges  are  otherwise 
independent,  the  expected  extent  of  overlap  is 

ab 

a  4*  b 

Proof:  Without  loss  of  generality,  assume  that  b  >  a.  Place  the  origin  at  the 
beginning  of  the  a  range.  Then  the  right  limit  of  the  6  range  can  lie  between  0  and 
a  +  b  with  uniform  probability.  Thus,  the  expected  overlap  is  simply  given  by 


/(J1  xdx  t-  (6  —  u)a  +  /An  *  ^(a  +  b  —  x)dx 

which  evaluates  to 

ab 

a  -I-  b 

as  was  originally  claimed. | 

Lemma  2.  (iiven  k  >  l  constraints  each  of  extent  a,  the  expected  range  of 
overlap  is 


a 

k’ 


Proof:  The  proof  is  by  induction.  Clearly  the  /c  =•  I  is  trivial.  The  k  ~  2  case 
follows  straightforwardly  from  Lemma  1.  Now  assume  the  induction  hypothesis  is 
true  for  k—  l  and  establish  it  for  k.  Ilv  Lemma  1,  adding  the  constraint  yields 
an  expected  range  of 


and  this  reduces  to 


a 

k 


thereby  establishing  the  induction  hypothesis.! 


This  second  lemma  implies  that,  in  the  three  degree  of  freedom  case  being 
discussed  here, 


lv;J 


n  v;J  < 

m  -  t  I  u  --  U 


Th..  ,  using  equation  (2),  the  expected  number  of  interpretations  is  bouniled  by 
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n  fc-t-  1  — 

e  n  — 

>  “='  i= 2  (>  -  l) 

If  we  lei,  t/  denote  this  normalized  volume 


V* 


at 


3  vp 


V  = 


V* . 

S  VfAl| 

VT 


then  the  expected  number  of  interpretations  is  bounded  by 


.ifc 


[nu'j 


n - =- . 

(/c!)3 


Since  it  is  well  known  that 


k\  >  I  - 


this  reduces  to  an  expected  upper  bound  of 

JT”k< 

k* 

This  is  clearly  a  much  tighter  bound  than  that  of  equation  (3).  Indeed,  for  fixed  n, 
this  behaves  like 


l  i 


and  the  k  ''k  term  is  extremely  important,  since  for  fixed  n  this  function  is  unimodul, 
rising  to  a  peak  for  some  value  of  k  dependent  on  the  values  of  the  constants,  and 
t hen  dramatically  decreasing  as  k  is  further  increased.  A  sample  plot  is  shown  in 
f  igure  3.  In  Figure  4,  the  log  of  the  number  of  expected  interpretations  is  plotted 
as  a  function  of  k,  with  each  plot  corresponding  to  a  doubling  of  the  constant  of 
the  previous  plot. 

It  is  worth  noting  that  exactly  this  type  of  behavior  was  observed  empirically 
by  .Crimson  and  Lozano- Perez  84 j.  In  Figure  5,  we  plot  the  number  of  consistent 
interpretations  as  a  function  of  the  number  of  data  points  for  an  actual  example. 
Note  that  the  form  of  the  graph  is  very  similar  to  that  of  1  igure  3.  The  fart 
that  the  number  of  interpretations  docs  to  asymptotically  approach  1  is  due  to 
symmetries  in  the  object  which  violate  the  assumptions  of  uniform  distribution 
of  faces  in  KC-spaoe.  ICvon  so,  the  performance  on  real  data  closely  matches  that 
predicted  by  theoretical  estimates. 

Finally,  it  is  clear  that  none  of  the  above  discussion  was  critically  dependent 
on  the  fact  that  we  were  dealing  with  the  two  dimensional  case,  although  it  was 
easy  to  visualize  in  this  case.  Similar  constraints  in  the  three  dimensional  case  will 
also  result  in  a  restriction  of  the  swept  volume  of  RC  spare,  appropriately  defined. 
Here,  the  spatial  components  of  the  RC-spacc  will  be  similarly  bound.  In  particular, 
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I'i^uro  A  .•'ample  plot  of  n'ck  for  fixed  n  and  c  and  increasing  k.  The  function  is 
unimodal,  peaking  for  some  small  value  of  k. 


Kigure  -I.  The  logarithm  of  the  expected  number  of  interpretations  plotted  as  a  function 
of  k.  1'arh  graph  corresponds  lo  a  doubling  of  the  constant  used  in  the  previous  plot. 

if  j  is  the  number  of  independent  constraints,  deGning  the  swept  volume  jV’^j 
then  the  expected  hound  on  the  number  of  interpretations  for  k  +  1  sensory  points 


iv  : TjU' r }>r« ‘t.il ions  for  .t  rc.u  i m  ;>,«>u >  u  . v '  i 

Vlir  form  of  the  t'.r.Ljili  should  In'  p.i r<  .1  to  r  ! j4 u r a. 


.hi  dimensional  t'.i.no,  we  nave  seen  that  j  -  -  3  anti  for  the  throe  uiiitem  a  o.o 


m.i  2  io  establish  thi-M-  bounds,  wo  implicitly 


i o.u  n.es  of  i v C  1 1  obi. lined  by  intersecting  the  swept  vo. times  (hom'd 


.  r.nt.' 


lit-;.-;  oe  non-empty  this  t  it  us  t  iioid  t  r,.e.  iti  eonr-a  ,  .1. 

,  .  . ....  . 


,-,,m  (ii  ;),<•  (  ur;oo;  u.t i  rpret  ;it  ton,  tint  lor  outer  cases,  u.o  iii;  erseci eit  voiumt.- 
t,  •  eii.piy.  This  implies  that  tighter  bounds  could  be  obi. lined  by  taking 
, t . ,  i,  r, .nut .  vine-  .n  such  rases,  the  expected  overlap  would  be  smaller  limn 
ex,  .oa  der.ved  m  l.emtna  2.  We  also  note  that  while  the  expected  vo.nmc 
,  i  bv  e.  plat  toil  V  tends  to  0  witii  increasing  k.  this  does  not  imply  that 
,  ne:  f  feasible  interpret. moils  also  tends  to  0.  Tins  .s  since  the  laces  ot 

,  t.  are  represent eu  'by  f  functions  m  ihts  space,  and  tinis  a  continuous 
iiaii  process  sue  ii  as  eijii.it  iuii  v  a  j  may  stiil  contain  a  6  function  w.tliin  its  scope, 
,  .'a  ;t.,  the  volume  doMnbcu  by  it  shrinks  to  an  iiiimitesimal.  >o  long  as  t n * ■  d 
...  uons  representing  d. intent  faces  are  not  arbitrarily  close  to  one  another,  lor 
, ,  i  tuple,  under  t  he  a.~sun.pi  nui  of  mu  form  dist  r  ibut  ion,  t  lie  asy  nipt  ot  le  red  net  ion  ol 
pint  ion  (a)  with  incre.mi:;’  v  does  imply  a  correspond  mg  ri'iiuniun  in  tin*  miii.bcr 
i  interpretations,  and  e. {nation  could  be  amended  by  adding  a  celling  lunction 
<>  the  entire  expression. 

1'in  illy,  we  stress  that  to  this  point,  we  nave  not  actually  used  the  speci'.c 
or  ins  i  1  t  he  pair  w  isc  constraints,  hut  only  t  tie  I  act  t  hat  such  innstr. nuts  rest  ru  t  'he 
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ranges  of  parameters  that  de-a-ribe  relationships  between  faces  of  a n  object.  Thus, 
provided  we  can  bound  the  volume  of  RC-space  spanned  by  such  a  constraint,  the 
results  derived  above  give  us  a  means  of  measuring  the  eilieieney  of  such  com' mints 
in  restricting  the  set  of  feasible  interpretations  of  the  data,  and  the  behavior  of  the 
bounds  will)  increasing  numbers  of  data  points  allow  us  to  measure  the  eilieieney 
of  the  constraints  in  restricting  the  search  space. 


4.2.  Predictions  of  t  he  Hounds 


Several  interesting  factors  can  be  derived  from  these  bounds.  We  already  have 
noted  that  this  bound  i-  a  unimode.!  function  in  k.  We  first  derivc  an  expression 
for  the  value  k\)  for  whh  h  'his  maximum  is  aeliieved.  Suppose  we  let 


I'1  k.  j  .  n 


/.7 


i 


denote  the  number  of  expected  interpretations  ror  k-*- 1  points  of  sensory  information, 

where  j  is  the  number  o*'  dimensions  of  the  R( '-space  that  are  explicitly  constrained 

by  the  local  constraints.  Ry  selling  •dm  derivative  of  the  logarithm  of  !C  to  zero,  we 

find  that  the  function  /.'  reaches  a  maximum  for 

,  -  .A 

mi  ---  nv  ]’ . 

At  this  point,  the  bound  on  the  expected  number  of  feasible  interpretations  is 
simply 

■  ?  •  1 1 
n  exp  j  inv 

so  that  this  places  a  bound  on  *he  maximum  number  of  interpretations  that  must 
be  considered  at  any  goint.  in  the  generation  of  the  tree  of  interpretations. 

Secondly,  wo  can  invert  this  bound  to  predict  the  number  or  data  points 
needed,  in  general,  to  guarantee  a  unique  determination  of  the  object's  orientation. 
In  particular,  given  a  value  of  !'  u\g.  /','  <  l)  we  can  use  Newton  s  method  to  solve 
for  the  number  of  da' a  points  needed  to  reduce  the  bound  in  equation  ,a)  below 
this  level.  While  numeric  solution  of  this  implicit  equation  for  k  will  yield  the  best 
results  for  specific  values  of  n,  j  and  v,  we  can  also  derive  art  upper  bound  on  the 
number  of  data  points  required  to  produce  a  unique  interpretation.  This  is  given 
by  a  solution  for  k  of  the  following  equation: 


or 


(a  :  [)  l0g  v  t-  jk  a-  k  log  v'  —  jk  log  k  <  0 
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where  we  have  replaced  the  variable  v *  by  the  variable  a  through  the  relationship 

v‘  —  na. 

One  met!  od  of  ensuring  that  this  expression  is  true  is  to  require  that  both  terms 
in  the  pr<  duct  on  the  left  hand  side  are  greater  than  1.  In  particular,  the  second 
term  then  reduces  to  the  inequality 
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i-  t  h«*  i'  r  1 1  ir :  1 1  t  art  or  m  ii**i  •  •  r  1 1 1 ;  u  >  j  1  j.T.  ; 1  •  t  •  graceful  dcgradat  ion  of  'he  constraints. 
1  ’no  aifd  t  Ins  expression  i-  •  mall.  the  relative  changes  in  those  parameters  arc  also 
small,  and.  thus  the  combma.t  ones  ot  'he  const  mints  degrades  «',r  •  r  <  • ! H ! !  \  Nuti:  that 
this  graeelul  dcgradat  ion  improves  as  the  number  of  independent  const  mints  j 
increases. 


5.  Specilic  Bounds 


We  now  turn  to  a  careful  consideration  of  the  specilic  c.onst  raints  that  imply 
to  our  particular  case,  and  derive  ■•xplieit  i  xperted  hounds  on  tile  1 1 1 1 : •  i 1  .•  < ■  r  of 
interpretations  consistent  with  -  1  sensory  points,  bv  estimating  * l;e  vo'utt.e 
parameter 


i.  1 .  'I’he  Two  I)i 


i  rnens'on; 


a'  Case 


We  will  first  consider  the  two-dimensional  ease  of  objects  with  three  degrees 
of  freedom  relative  to  their  model  coordinates,  and  will  then  extend  this  to  the 
general  'hrec  dimensional  <‘.w”  of  obiect<  having  s;x  degrees  of  positional  freedom. 

First,  we  will  make  the  following  assump'ions  about  the  object  being  sensed. 

!  A  'I  'he  edges  or  'he  obi'T?  hav<'  tl,e  same  length  /. 

The  diame’ er  of  the  objer'  is  denoted  by  the  constant  D. 

We  make  the  following  assump'ions  about  the  sensory  information. 

h  The  positions  or  con’ac*  with  'he  obicet  are  known  to  within  a  circle  of 
radms  r . 

2.  The  normals  to  the  sensed  edges  are  known  to  within  an  angular  error  of 


Now  suppose  that  we  arbitrarily  choose  some  face  on  which  to  place  (he  first 
contac*  point  (obviously  wc  have  no  constraints  on  this  anyway).  Given  a  second 
sense  point,  ’he  first,  quest -on  to  consider  i«  how  nianv  faces  are  consistent  with  t!\e 
constraints  betweeti  *  he  'wo  sensory  points.  To  answer  this,  we  first  review  what 
constraints  are  available  from  two  sensory  points. 

!.  W’e  can  measure  'to  wi'h’.n  some  error'  the  angle  0„  between  the  normal 
of  ♦»,„  <;r  .•  face  and  ‘he  norma*  o!  'he  second  face.  This  is  given  by  the 
combination  ol  knowing  the  do*  product  between  the  two  normals  and 
the  cross  product  of  ‘he  two  riorma’s  (which  is  the  simplified  version  of 
the  triple  product  ,-onst ramt  in  the  case  of  two  dimensions).  In  fact,  this 
measurement  is  ac<  nr.a'e  to  within  a  range  of  a  7. 

2  W’e  '-an  measure  'he  length  d  of  the  contact,  vector  v  between  the  'wo 

sei|-  . "|  pom's  'with  a  possible  error  of 
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d.  Finally,  we  can  determine  the  angle  between  the  contact  vector  v  and  the 
first  face  normal.  To  determine  the  range  of  possible  error,  we  note  that 
each  endpoint  of  v  lies  within  a  circle  of  radius  r.  It  is  straightforward 
to  show,  either  algebraically  or  geometrically,  that  the  maximum  angular 
deviation  of  the  true  contact  vector  from  its  measured  position  is  given 
by 


tan 


Our  general  technique  in  estimating  bounds  on  the  combinatorial  e/iiciency  of 
these  constraints  will  be  the  following: 

1.  Construct  a  relative  configuration  space  (RC-space)  centered  at  the 
midpoint  of  the  first  face. 

2.  Hound  the  volume  of  this  space  in  which  the  second  face  must  lie. 


H.  iistimate  the  number  of  faces  lying  within  this  volume.  To  do  this,  we  will 
assume  that  the  faces  are  uniformly  distributed  over  a  volume  of  dC  space 
within  D  of  the  midpoint  of  the  first  face.  While  tins  will  generally  be 
true  over  a  large  collection  of  objects,  clearly  for  any  single  oi.jei  .  this 
wiii  not  be  completely  vaiiu.  We  will  argue  that  the  bounds  derived  under 
this  assumption  are  almost  everywhere  bounds.  That  is.  while  there  may 
be  certain  degenerate  cases  which  violate  the  bounds,  in  general,  over  the 
class  of  possible  objects,  the  bounds  will  almost  always  be  true. 

I.  Kx; end  the  analysis  to  consider  the  combinatorial  effect  of  k  points  on  the 
number  of  faces  assignable  to  the  k  -t-  1st  point. 

We  arbitrarily  assign  a  face  to  the  first  sense  point,  and  construct  a  relative 
configuration  space  about  this  face.  The  orig  n  of  the  spatial  dimensions  of  the 
RC  '-space  lies  at  the  midpoint  of  the  face.  The  normal  of  this  face  delines  the  origin 
of  the  angular  dimension  of  the  RC-space.  Thus,  a  face  is  represented  in  RC-space 
by  trie  position  of  its  midpoint  relative  to  the  midpoint  of  the  base  face,  ami  by 
the  angle  of  its  normal  relative  to  the  normal  of  the  base  face. 

Suppose  that  the  base  of  the  contact  vector  v  lies  at  one  of  the  endpoints  of 
the  first  face.  What  positions  could  the  endpoint  of  v  take  in  RC-space?  First,  we 
know  that  the  angle  ol  this  vector  relative  to  the  sensed  normal  to  within  a  range 
tar,  (2 c/d)  and  we  know  that  the  variation  in  the  sensed  normal  relative  to  the 
face  normal  is  given  by  7.  Thus,  the  angle  made  by  the  contact  vector  relative  to 
the  fare  normal  is  defined  by  the  range 


Thus  the  area  of  RC-space  swept  out  by  the  endpoint  of  v  is  given  by 


which  evaluates  to 
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Figure  fi  The  swept  voliree  of  possible  posilions  for  a  sensory  point,  given  that  the  first 
sensory  point  lies  on  Hie  base  'ace. 

•if  7  -  tan  1 

V  a  J 

Tlin  total  area  of  UC-spaee  swept  out  as  the  base  point  of  v  is  moved  across  the 
base  face  is  diagrammed  in  Figure  6. 

In  order  to  ease  the  what  follows,  we  consider  the  problem  of  cnscribing  a 
rectangle,  with  sides  aligned  along  the  coordinate  axes  of  the  KC-spnce,  about  this 
volume.  We  let 

2e 

a  —  7  +  tan  — . 

d 

We  will  assume  that  7  <  *  and  that  we  only  consider  contact  points  such  that 
d  >  2r.  In  this  case,  0  <  <>  <  We  may  assume,  without  loss  of  generality,  that 
0[  1  lies  in  the  range  ■(),  H ].  In  order  to  determine  bounds  on  the  dimensions  of  the 
enscribing  rectangle,  we  consider  three  case. 

Consider  first  the  case  of  Oq  >  In  this  case,  one  can  show  that 

J/min  =  [d-t)*\n[0o  +  a) 
i/max  -  {d  J-  r)  sin  (0o  -  a) 

Tmin  =  (d  4  c)  cos  {f?0  +  a) 

•r m ax  —  (d  -  1)  cos  ( 0o  —  a)  +  l. 

Similarly,  if  Oq  <  3  —  r>  then 

ymin  =  (d-e)sin(Oo-  a) 

2/max  —  [d  +  c)sin(0o  +  cr) 
xmin  =  (d  ~()cos{00  +  a) 
xm;ix  ~  (d  -F  < )  cos  (Oq  —  or)  4- 1 , 
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and  if  '  —  a  <  0y  <  ^  ^  a  ^en 

ymm  =[d-()  inin  {sin  (0O  -  a),  sin  (0O  -r  a)} 

2/inax  —  {d  +  t) 

xn\in  ={d-t)cos{0Q  +  a) 

J" i n. ix  =  [d  r  t)  cos  (0O  -  a)  +  l. 

Thus,  for  example,  the  range  in  y  is  given  in  these  three  rases  by 

A y  =  —2d  cos  0()  sin  a  +  2 e  sin  do  cos  a  Case  i 

—  2d  r os  0y  sin  q  +  2c  sin  dy  cos  a  Case  2 

—  yd  -r  c)  —  [d  -  c)  min  (sin  (do  —  a),  sin  (do  -t-  a)}.  Case  3 

We  want  to  consider  the  maximum  extent  of  this  range.  Applying  standard 
minimization  techniques,  we  find  that  the  maximum  value  for  both  Case  1  and 
Case  2  are  given  by 

Ay  —  2 \'d2  sin2  a  +  c2  cos2  a 
and  for  Case  3,  the  maximum  occurs  at 

2 (a  sin2  a  -r  t  cos2  a). 

Moreover,  the  extremum  observed  in  Case  l  and  Case  2  is  always  greater  than  that 
of  Case  3,  so  we  can  bound  the  y  component  of  the  enscribing  rectangle  by 

Ay  =  2\/ d!-  sin2  a  +  e2  cos2  a. 

In  a  similar  manner,  we  find  that  we  can  bound  the  x  component  of  the 
enscribing  rectangle  by 

Ax  --  i  +  2\j d*  sin2  a+t2  cos2  a. 

This  gives  us  a  rectangular  bound  on  the  area  of  RC-space  in  which  the  second 
sensory  point  can  lie,  given  that  the  first  sensory  point  lies  somewhere  on  the  base 
face.  Since  a  face  is  described  in  RC-space  by  the  position  and  orientation  of  its 
midpoint,  relative  to  ihe  base  face,  wc  need  to  expand  this  volume  to  cover  the 
possible  positions  of  the  midpoints  of  the  face.  Clearly  the  most  straightforward 
way  to  do  this  is  to  expand  this  rectangle  by  £  on  all  sides,  since  a  sensed  point 
must  lie  within  that  distance  of  the  midpoint  of  the  face.  This  yields  an  area  in 
R( '-space  of  dimensions 


2f  +  2  \d-1  sin2  a  -+-  t2  cos2  a 


by 
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in  which  the  midpoint  of  a  face  must  lie  in  order  to  be  consistent  with  the  sensory 
information. 


Finally,  we  must  account  for  the  orientation  of  the  face.  Clearly,  the  angular 
component  of  RC-spare  has  extent  2n  and  the  range  of  possible  values  for  the 
orientation  of  the  second  face  relative  to  the  first  is  given  by  2q. 

This  defines  a  volume  in  RC-space  in  which  a  face  must  lie  in  order  to 
be  consistent  with  the  sensory  information,  lit  order  to  make  useful  estimates 
concerning  this  volume,  we  need  to  normalize  this  volume.  To  do  so.  we  will  make 
the  assumption  that  the  fares  have  equal  probability  of  lying  anywhere  within 
this  RC-space  (which  spans  an  area  ~  l)~  in  the  spatial  dimensions  and  2tr  in 
thi'  angular  dimension).  Clearly  this  assumption  is  not  necessarily  valid  for  arty 
particular  object,  although  when  averaged  over  all  possible  orientations  of  the 
object,  it  becomes  more  accurate.  Irt  this  ease,  the  normalized  volume  of  RC-space 
in  which  a  face  must  lie  is  bounded  above  by 

\’>  .Si  X  S2 


where 
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The  number  of  possible  faces  lying  within  this  volume  is  given  by 


nVj. 

Note  that,  as  expected,  V,  is  a  dimensionless  quantity,  depending  only  of  ratios 
of  parameters  of  the  polygonal  object.  Also,  we  can  restrict  our  computation  to 
cases  where  d  2c  so  that  o  can  be  bounded  above  by  7  +  *-• 

This  normalized  volume  leads  to  a  straightforward  bound  on  the  number  of 
interpretations  consistent  with  k  -J-  l  sense  points,  namely 


n[nV2}k . 

Of  course,  this  ignores  much  of  the  combinatorial  constraint  available  to  the 
technique,  since  it  comes  from  simply  using  the  constraints  between  the  first  sense 
point  and  all  additional  sense  points  but  ignores  the  fact  that  the  fcltl  face  is 
constrained  by  all  k  1  previous  sense  points,  and  not  just  the  first.  Nonetheless, 
under  the  assumptions  made  previously,  this  bound  represents  a  worst  rase  bound, 
in  which  the  inclusion  of  additional  sense  points  dors  not  reduce  the  normalized 
volume  of  RC-space  specified  by  the  constraints  between  the  Grst  and  the  current 
sense  points. 

We  note  that  for  fixed  n,  the  expression  for  the  constant  nVb  actually  defines 
a  relationship  between  the  two  error  ranges  as  a  function  of  the  object  parameters. 
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Tims,  'l  describes  l  fir  tram-ofl  brUvceii  the  err...*  m  , >i, - 1 1 1» .  am..- ur-  :a.  a’  -  .u.d 
iiurtii.il  measurement*  to  guarantee  convergent  <■  ui  the  •  wm;.:;iai..rii  -  with 

increasing  sensed  point .  No  to  t  h:il  prod  not  rol.it  mu  ship  (>«••  v\  ■  hes.  or  r.  .rs  kes 
mmim’.  since  an  increase  in  errors  in  position  should  require  a  m-crcase  ;n  .-rn.rs  m 
normals  in  order  to  maintain  the  same  iiuinbor  oi  interpret .d  ions.  .inn  vuo  versa. 

1' h is  hound  is  actually  quite  weak,  however,  ■once  it  io.se*  a  gre.it  ma.i  of  the 
comliin.itori.il  power  oi  combining  dilTerent  const  mints.  t  w,,.  ,mr  me  *•  •  i'1  point.. 
Not  on.'.'  should  the  jiairwi.se  constraints  between  the  first  sen-. si  point  ..mi  tin: 
A  --  1  1  jioiiil  restnet  tne  Voiui.iO  ol  H(  -sj.,ui‘  in  Wtncn  lilt  s  .  I  lace  e.i .a’,  but 
so  should  t  tie  pairwise  constraints  between  the  second  point  .mo  the  k  •  l"1  point 
and  so  >n.  Ulilie  all  oi  tiie-e  vi.lumes  cou ui  be  uient.cai  ->:i  . . . e r . i ; * , ■  ,•>  ,<  i  t  the 

volume:,  to  only  parti. -.dy  overlap.  Thus,  t ii e  voiuine  lor  tt.e  K  •  i’1  ,'ace  snoi.ni  oe 
tim  intersection  of  all  oi  these  voiutues.  A  bound  on  tins  inters,  emu  volume  is  given 
t>y  beminas  !  and  2.  Thus,  .or  example,  trie  angular  component  for  the  k  -  l  '1  is 
rcu ueee  ey  a  factor  i)I  a.'. 
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5.2.  Tin'  Threi*  Dimensional  Case 

We  (  .in  expand  the  derivation  of  (lie  previous  section  to  deal  with  the  three 
dnii.  nsion.il  case.  Here,  the  faces  are  squares  of  side  f,  rather  than  one  dimensional 
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edges.  It  is  st ruight forw ard  to  show  that  the  positional  aspects  of  the  derivation 
cart  he  decomposed  into  and  y  components,  both  yielding  normalized  ranges  of 
size  s  i ,  white  the  c  ro'nponent  has  a  normalized  range  given  by  sj.  Only  one  of  the 
two  rotational  parameters  ran  he  const  rained  by  the  measurements  between  the 
two  unit  surface  normals,  so  that  'he  normalized  volume  in  this  case  is  given  by 
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An  expected  hound  can  tie  derived  in  a  manner  similar  to  the  two  dimensional  ease, 
where  now  the  diagonal  of  the  three  dimensional  rube  is  given  by 


Here  the  expected  volume  is 
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We  have  yet  to  apply  :1m  'rude  product  constraint  Crimson  and  Lozano  Perez 
los  t  .  however.  Whin*  this  con-*  ram*  has  no  app,ic;t,'on  to  pairs  of  points,  for 
triples  of  points  it  do*',  ipp'y  In  par' 'filar  the  following  ease  tm'ds.  Consider 
three  independen'  uni*  normC'  n* .  trn.  n  and  assume  we  know  the  dot  products 
between  each  of  then',  d'mn'rd  by  *,,,  as  \s *  ’ !  ;i>  the  sign  of  'dm  triple  product 
n ;  n.'ii.c .  Then 
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where  u  is  a  unit  vector  such  'hat  u  •  n ,  --  ().  Similarly 
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where  w  i«  a  uni'  vector  simh  that  w-tij  0.  Now  'he  do‘  product  between  those 
vectors  is  known  by  measurement, 
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—  £23- 

Thus,  the  cosine  of  the  angle  between  u  and  w  is  determined,  say  w  •  u  —  cos0. 
Kvaluating  ttie  triple  product  gives 

'ii|Hjn;)]  —  -  e^sin# 

and  thus  the  angle  0  is  determined.  This  implies  that  while  there  is  a  free  degree  of 
freedom  between  ri|  and  ttj.  corresponding  to  a  ro*  at  ion  about  rij,  the  attitude  of 
ri;j  is  completely  determined  from  the  measured  dot  products  and  triple  product. 

This  implies  that  while  the  expected  volume  for  the  second  point  is  given  by 

7  3'* 
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for  additional  points,  the  second  rotational  degree  of  freedom  is  also  restricted  and 
the  expected  volume  is  given  by 
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b.K.  Degradation  w’uh  Noise 

Havint*,  derived  specific  bounds  on  the  effective  printing  of  the  local  eons)  mints, 
it  i.s  easy  to  see  li(>\v  t':’-  prr-»nce  of  sensor  error  a  fleets  the  expected  number  of 
ronsisia  n i  hv  pot  t.e-es.  lo  <i <  motisl rate  the  graceful  degradation  of  the  constraints, 
e.<  r i . i v i  pit, tied  the  logarithm  oi  the  number  of  expected  feasible  interpretations 
a'  i  i  unci  ion  of  ttie  number  of  sensory  points,  lor  several  different  error  con o it  ions. 

in  Figure  7.  \vi  plot  the  number  of  interpret  at  ions  tvhiic  varying  the  amount 
■!  crr>,r  in  measuring  surface  positions.  Kadi  plot  corresponds  to  a  doubling  of  this 
;  ,1'giojiai  error  from  the  previous  one. 

Ir,  i  .pure  S,  we  plot  tiic  nun. her  of  interpretations  while  varving  the  amount  of 
rror  ;n  measuring  sun. tee  nr,,  nl  at  ion .  Kadi  plot,  corresponds  to  a  doubling,  of  ft, is 
mp.uiar  error  from  the  previous  one.  In  both  rases,  we  see  that  the  combinatorics 
'ere  r.iiiv  decrease  in  a  graeel’ui  manner. 


Halin'  H.  Tho  ! *  l»m  o'  f!?o  ri * i mix' r  of  int(Tpr»it;it  ;ojim  ;l'»  a  ftinrlioti  of 

fi'inihor  of  ^rn^iry  p->iM*  i!lr>'  raN's  t!u-  t'TorN  of  doublim;  the  amount  o^  orror  :\ 

measuring  ^urfnco  oneri*  I'tor,  w,'-t  a!!  <»',  n«-r  par anirt -  ts 

6.  Applicationof  the  Theory 

\VV  beg  an  our  investigation  of  the  problem  o<"  model-based  recognition  and 
localization  by  establishing  a  sot  of  criteria  that,  should  be  saiisf-ed  1  >  v  constraints 
between  sensory  tla'a  aad  mod'-'  •Inuen's.  'a  particular,  \v>  argued  'hat  the 
constraints  should  be  coordinat*-  fr  ( independent,  -ample.  ■•i-ieer  independent, 
combinatoriaMy  powe.lu!  and  deer  de  gracehi’ly  wi'h  error.  Wo  have  spent  the 
bulk  of  tins  paper  deriving  —>rh  a  s.-r  t,r  const  rain's  an,;  <vtablishinr,  on  theoretical 
grounds  that  these  criter-a  are  «a' t-  hed.  This  1 . h.eoretica!  basis  was-  ;>n  outgrowth 
of  earlier  work  ( irimson  and  ho/atio-lV-re/  10S-J  in  which  an  isomorphic  --et  of 
constraints  was  proposed  and  tested. 

As  further  support  lor  the  theoretical  results  reported  here,  we  briefly  review 
some  of  the  results  of  testing  that  algorithm,  fn  particular,  a  large  set  of  simulations 
were  run  on  a  series  of  test,  objects,  for  var\ mg  types  of  error  conditions.  The 
following  '  tide  summnr'zes  some  of  these  simulations,  described  in  wore  detail  in 
(Irimson  and  Lozano- Perez  8-!',  for  the  objects  illustrated  in  Figure  9. 
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'Thus,  i’or  example,  in  the  ca.se  of  the  object  illustrated  in  Figure  10,  a  mure  accurate 
bound  would  be  given  by  using  j  —  2  rather  than  the  more  general  j  —  3. 


Finally,  the  key  results  of  the  analysis  presented  here  are  in  large  part 
independent  of  the  specific  constraints  used  to  restrict  the  search  space.  This 
suggests  that  a  similar  analysis  using  a  swept  volumes  in  a  re.ativc  conligurat ion 
.-pace  may  oe  applicable  to  other  cases  ot  consistent  mter^ri-tat  .on  o!  n.usy, 
measured  data,  For  example,  extensions  of  the  technique  to  curved  surface.-,  ana  to 
other  types  of  iocai  sensory  tneasuremenis  remain  interesting  questions  for  ore 
researen. 
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